ASYMPTOTICS OF POSTERIORS FOR BINARY BRANCHING PROCESSES

Abstract

We compute the posterior distributions of the initial population and parameter 
of binary branching processes, in the limit of a large number of generations. We 
compare this Bayesian procedure with a more naive one, based on hitting times 
of some random walks. In both cases, central limit theorems are available, with 
explicit variances. 
1. Introduction

This paper is devoted to some estimation procedures for binary branching processes
in a Bayesian setting. To be more specific, let $(X_n)_{n\ge 0}$ denote a Galton-Watson process
which starts from the initial population $X_0\ge 1$ and whose offspring distribution is

$$(1-U)\,\delta_1+U\,\delta_2\quad\text{with } 0<U<1,$$

where $\delta_x$ denotes the Dirac mass at $x$. This means that, at every generation, each
individual dies and is replaced by 1 or 2 individuals, with probability $1-U$ and $U$
respectively, independently of the fate of the other individuals, and that $X_n$ is the size of
generation $n$.
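
The dynamics just described can be sketched in a few lines of Python. This is only an illustration of the model, not part of the estimation procedures studied below; the function name `simulate_path`, the seed and the parameter values are arbitrary choices made for this example.

```python
import random

def simulate_path(x0, u, n, rng):
    """Return [X_0, ..., X_n] for the binary branching process.

    Each individual is replaced by 2 individuals with probability u
    and by 1 individual otherwise, independently of the others.
    """
    path = [x0]
    for _ in range(n):
        current = path[-1]
        # Number of individuals of the current generation that split in two.
        doubled = sum(1 for _ in range(current) if rng.random() < u)
        path.append(current + doubled)
    return path

rng = random.Random(0)  # fixed seed, for reproducibility only
path = simulate_path(x0=3, u=0.4, n=10, rng=rng)
print(path)
```

Every simulated path satisfies $X_k\le X_{k+1}\le 2X_k$, since each individual leaves at least one and at most two descendants.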

In a Bayesian framework, the initial population $X_0$ and the offspring parameter $U$
are both random and unknown. To keep things simple, we also assume that $X_0$ and $U$
are independent, and we wish to estimate them from the observation of a finite path
$x_{1:n}=(x_k)_{1\le k\le n}$ of the process $X_{1:n}=(X_k)_{1\le k\le n}$ up to a given time $n\ge 1$.

Well known motivations for such a study are various biological settings where one
observes $X_{1:n}$ but $X_0$ and $U$ are unknown. One example is the modeling of polymerase
chain reaction. Probabilistic models of polymerase chain reactions were proposed
and studied by Sun (1995), Weiss and von Haeseler (1995) and (1997), Peccoud and
Jacob (1996), Piau (2002), (2004), (2005), and Jagers and Klebaner (2003). Recently,
Lalam and Jacob (2007) introduced and studied the Bayesian setting above, see also 
Lalam (2007). For other Bayesian approaches of branching processes, see Scott (1987), 
Prakasa Rao (1992), Mendoza and Gutierrez-Pena (2000), and, for the interesting 
model of bisexual branching process, Molina, Gonzalez and Mota (1998) for example. 
Finally, the idea of studying a branching process backwards, but to estimate its age 
rather than its initial population, is in Klebaner and Sagitov (2002). 

In models of polymerase chain reactions and in similar contexts, the initial population
$X_0$ is the size of a small sample, extracted at random from a much larger
population. This suggests that the initial population $X_0$ should be Poisson distributed,
say with parameter $\Lambda$. We assume that $\Lambda$ is random as well. Jeffreys' principle, see
Kass and Wasserman (1996), then indicates that the prior distributions of $\Lambda$ and $U$
should be proportional to measures which we compute below. To sum up the result of
these computations, the prior of $\Lambda$ is easy to write down but improper and the prior of
$U$ is awkward but proper. However, the posterior of $(X_0, U)$ conditionally on $X_{1:n}$ is
a proper distribution, which can be computed explicitly. In particular, this posterior
distribution depends only on $X_1$, $X_n$ and $S_n = X_1 + \cdots + X_n$. Unfortunately, it is also
rather unwieldy.

In such situations, one may rely on numerical algorithms, based on MCMC for 
example, to simulate the posterior distributions with any prescribed degree of accuracy. 
Rather, we look for simple asymptotics in realistic regimes. Namely, we assume that
$n$ is large and we are interested in the asymptotic posterior distribution of $(X_0, U)$
assuming that $X_n$ is large and that the ratio $S_n/X_n$ converges to a finite limit. This
assumption is almost surely fulfilled by the paths of binary branching processes since 
these are supercritical. In this setting, we show that the posterior distributions indeed 
converge and we compute explicitly their limit. 
2. Results

To describe our results, we introduce some notation. Let $x_{0:\infty}=(x_n)_{n\ge 0}$ denote a
sequence of positive integers. We say that such a sequence is admissible if, for every
nonnegative $n$, $x_n\le x_{n+1}\le 2x_n$. We say that an admissible sequence is regular
if furthermore, $x_n/s_n$ converges to a positive limit when $n$ goes to infinity, where
$s_n = x_1+\cdots+x_n$. The binary index $B(x_{0:\infty})$ of a regular admissible sequence $x_{0:\infty}$
is the real number in $]0,1]$ defined by

$$B(x_{0:\infty})=\lim_{n\to\infty}\frac{x_{n+1}}{s_n}.$$

The renormalized index $R(x_{0:\infty})$ of a regular admissible sequence $x_{0:\infty}$ is the real
number in $[0,+\infty[$ defined by

$$R(x_{0:\infty})=\lim_{n\to\infty}\frac{(s_n-x_{n+1})^2}{4\,x_{n+1}\,s_n}.$$

Almost every (sequence which can be realized as a) path of a binary branching process
is admissible and regular. The renormalized index is a function of the binary index,
namely $R(x_{0:\infty})=g(B(x_{0:\infty}))$ where, for every $u$ in $]0,1]$,

$$g(u)=\frac{(1-u)^2}{4u}.$$

The binary index and the renormalized index are asymptotic quantities, in the sense
that, for every nonnegative integer $n$, the indexes of a regular admissible sequence
$x_{0:\infty}$ do not depend on the first values $x_{0:n}$.
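
As a small sanity check of the relation between the two indexes, the sketch below evaluates their finite-$n$ versions on a short admissible sequence with exact rational arithmetic. It assumes the readings $x_{n+1}/s_n$ for the binary index and $(s_n-x_{n+1})^2/(4x_{n+1}s_n)$ for the renormalized index, with $g(u)=(1-u)^2/(4u)$; the helper name `finite_indices` and the sample sequence are hypothetical choices made for this example.

```python
from fractions import Fraction

def g(u):
    """g(u) = (1 - u)^2 / (4 u), for u in ]0, 1]."""
    return (1 - u) ** 2 / (4 * u)

def finite_indices(xs):
    """Finite-n versions of the two indexes for xs = [x_1, ..., x_{n+1}].

    Returns (x_{n+1} / s_n, (s_n - x_{n+1})^2 / (4 x_{n+1} s_n)),
    where s_n = x_1 + ... + x_n.
    """
    s_n = sum(xs[:-1])
    x_next = xs[-1]
    b = Fraction(x_next, s_n)
    r = Fraction((s_n - x_next) ** 2, 4 * x_next * s_n)
    return b, r

# A hypothetical admissible sequence: x_k <= x_{k+1} <= 2 x_k at every step.
b, r = finite_indices([2, 3, 4, 6, 9, 12])
print(b, r, g(b))
```

The identity $r=g(b)$ holds exactly at every finite $n$, which is why the renormalized index is a function of the binary index in the limit.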

From now on, letters $k$ and $n$ are used to enumerate generations of the process (that
is, the time) and symbols $x$, $x_k$, $x_n$ and $y$ are used to measure population sizes.

Definition 1. (Distributions.) For every positive real number $r$ and every positive
integer $x$, the finite discrete measure $\nu(r,x)$ and the discrete probability measure
$\mu(r,x)$, both on the positive integers, are defined by

$$\nu(r,x)=\sum_{y=h(x)}^{x}\binom{2y}{y}\binom{y}{x-y}\,r^y\,\delta_y,\qquad \mu(r,x)=\frac{\nu(r,x)}{|\nu(r,x)|}.$$

For every positive integer $x$, the integer $h(x)$ in the formula above is the upper half of $x$,
that is, the smallest integer such that $2h(x)\ge x$. In other words, $h(2x)=h(2x-1)=x$
for every positive integer $x$.
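
The measures of definition 1 are easy to tabulate. The sketch below assumes that the weight of $\nu(r,x)$ at $y$ is $\binom{2y}{y}\binom{y}{x-y}r^y$ for $h(x)\le y\le x$; the function names `upper_half`, `nu` and `mu` are illustrative, and exact rational arithmetic is used so that the output can be compared with closed forms.

```python
from fractions import Fraction
from math import comb

def upper_half(x):
    """h(x): the smallest integer h such that 2 h >= x."""
    return (x + 1) // 2

def nu(r, x):
    """Weights of nu(r, x), as a dict {y: weight}, for h(x) <= y <= x."""
    return {y: comb(2 * y, y) * comb(y, x - y) * r ** y
            for y in range(upper_half(x), x + 1)}

def mu(r, x):
    """Probability weights of mu(r, x) = nu(r, x) / |nu(r, x)|."""
    weights = nu(r, x)
    total = sum(weights.values())
    return {y: w / total for y, w in weights.items()}

r = Fraction(1, 2)
print(mu(r, 2))  # weights proportional to (1, 3 r)
print(mu(r, 4))  # weights proportional to (3, 30 r, 35 r^2)
```

For instance, `mu(r, 2)` puts masses $1/(1+3r)$ and $3r/(1+3r)$ on $1$ and $2$.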
Our main result is as follows.

Theorem 1. (Posterior distributions.) (1) The path $X_{0:\infty}$ of a binary branching
process with parameter $U$ is almost surely regular admissible and its binary index is
almost surely $B(X_{0:\infty})=U$.

(2) Assume that the prior distribution of $(X_0, U)$ satisfies Jeffreys' principle. Then,
for every regular admissible sequence $x_{1:\infty}$ with binary index $u=B(x_{1:\infty})$ in $]0,1[$, the
posterior distribution of $(X_0,U)$ conditionally on $X_{1:n}=x_{1:n}$ converges when $n$ goes
to infinity to the distribution $\mu(g(u),x_1)\otimes\delta_u$.

Theorem 1 shows that the limit posterior distribution of $X_0$ when $n$ goes to infinity
is almost surely $\mu(r,x)$ with $r=g(U)$ and $x=X_1$. Unless $r=0$, $r=1$ or $x=1$,
$\mu(r,x)$ is not degenerate, hence the value of $X_0$ can be determined only with some
uncertainty, even from an infinite trajectory $X_{1:\infty}$. On the contrary, $U$ is a function of
the infinite trajectory $X_{1:\infty}$.

The limit distribution $\mu(g(u),x_1)\otimes\delta_u$ in theorem 1 converges to the Dirac
distribution at $(x_1,0)$ when $u$ converges to $0$ and to the Dirac distribution at $(h(x_1),1)$
when $u$ converges to $1$. Our next result describes the intuitively obvious variations of
$\mu(r,x)$ with respect to $r$ and $x$. First, since $r=g(u)$ is a decreasing function of $u$ and
the offspring distribution of the branching process is stochastically increasing with $u$,
one should expect $\mu(r,x)$ to increase stochastically when $r$ increases. Likewise, since
$x$ represents the population at time 1, one should expect $\mu(r,x)$, which represents the
population at time 0, to increase stochastically when $x$ increases.

We recall that a measure $\mu_1$ is stochastically larger than a measure $\mu_2$ if and only
if $\mu_1([z,+\infty))\ge\mu_2([z,+\infty))$ for every real number $z$.

Proposition 1. (Ordering of limit posterior distributions.) For every positive integer
$x$, the family $(\mu(r,x))_{r>0}$ is stochastically increasing. For every positive real number
$r$, the family $(\mu(r,x))_{x\ge 1}$ is stochastically increasing.

We now characterize the limit of $\mu(r,x)$ for every fixed value of $r$, when $x$ converges
to infinity.

Theorem 2. (Limit posterior distributions of initial populations.) Fix $u$ in $]0,1[$. For
every positive integer $x$, let $\xi_x$ denote a random variable with distribution $\mu(g(u),x)$.
When $x$ converges to infinity, the expectation and the mode of $\xi_x/x$ both converge to

$$m_u=1/(1+u),$$

and the random variables $(\xi_x-m_u x)/\sqrt{x}$ converge in distribution to a centered
Gaussian distribution with variance

$$\sigma_u^2=u(1-u)/(1+u)^3.$$
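
The two limits in theorem 2 can be checked numerically for a moderately large $x$. The sketch below assumes the weights $\binom{2y}{y}\binom{y}{x-y}r^y$ of definition 1 with $r=g(u)=(1-u)^2/(4u)$, and works with logarithms of the weights to avoid overflow. The function names and the values $u=0.5$, $x=2000$ are arbitrary illustrative choices, and floating-point output is only an approximation of the limits.

```python
from math import lgamma, log, exp

def posterior_weights(u, x):
    """Normalised weights of mu(g(u), x) on y = h(x), ..., x (floats)."""
    r = (1 - u) ** 2 / (4 * u)
    ys = list(range((x + 1) // 2, x + 1))
    # log of binom(2y, y) * binom(y, x - y) * r^y
    #   = log (2y)! - log y! - log (x - y)! - log (2y - x)! + y log r
    logs = [lgamma(2 * y + 1) - lgamma(y + 1)
            - lgamma(x - y + 1) - lgamma(2 * y - x + 1) + y * log(r)
            for y in ys]
    top = max(logs)
    ws = [exp(v - top) for v in logs]  # rescaled to avoid overflow
    total = sum(ws)
    return ys, [w / total for w in ws]

def mean_and_variance(u, x):
    ys, ps = posterior_weights(u, x)
    mean = sum(y * p for y, p in zip(ys, ps))
    var = sum((y - mean) ** 2 * p for y, p in zip(ys, ps))
    return mean, var

u, x = 0.5, 2000
mean, var = mean_and_variance(u, x)
print(mean / x, 1 / (1 + u))                # empirical vs m_u
print(var / x, u * (1 - u) / (1 + u) ** 3)  # empirical vs sigma_u^2
```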

For the sake of comparison, we turn to another natural way to estimate initial 
populations of branching processes with known offspring distributions, based on hitting 
times. To describe this in the setting of binary branching processes, we first introduce 
some notations. 

Definition 2. (Hitting times.) Fix a real number $u$ in $]0,1[$, and let $(\varepsilon_x)_{x\ge 1}$ denote a
sequence of independent Bernoulli random variables with distribution $(1-u)\,\delta_1+u\,\delta_2$.
For every positive integer $x$, let $\sigma_x:=\varepsilon_1+\cdots+\varepsilon_x$. Define the distribution of the
hitting time $\eta_x$ by the relation

$$\mathbb{P}(\eta_x=y)=\mathbb{P}(\sigma_y=x\mid H_x),\quad\text{where } H_x=\{\exists z\ge 1\,;\ \sigma_z=x\}.$$

When the value of $u$ is known, an estimation procedure of $X_0$ based on $X_1=x$ is to
propose the value $y$ for $X_0$ with probability $\mathbb{P}(\eta_x=y)$, thus an estimator of $X_0$ when
$X_1=x$ is the distribution of $\eta_x$.
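
The distribution of $\eta_x$ can be computed directly from the definition, since $\sigma_y=x$ means that exactly $x-y$ of the first $y$ increments are equal to 2. The following sketch does this with exact rational arithmetic; the function names `sigma_pmf` and `eta_distribution` and the value $u=1/3$ are illustrative choices only.

```python
from fractions import Fraction
from math import comb

def sigma_pmf(u, y, x):
    """P(sigma_y = x): among y increments, x - y equal 2 and 2y - x equal 1."""
    if not (y <= x <= 2 * y):
        return Fraction(0)
    return comb(y, x - y) * u ** (x - y) * (1 - u) ** (2 * y - x)

def eta_distribution(u, x):
    """Distribution of eta_x as a dict {y: P(eta_x = y)}."""
    hits = {y: sigma_pmf(u, y, x) for y in range((x + 1) // 2, x + 1)}
    p_hit = sum(hits.values())  # P(H_x): the walk visits x at some time
    return {y: p / p_hit for y, p in hits.items()}

u = Fraction(1, 3)
dist = eta_distribution(u, 4)
print(dist)
print(sum(y * p for y, p in dist.items()))  # mean of eta_4
```

The normalising constant `p_hit` is the probability of $H_x$; only the values $h(x)\le y\le x$ carry mass because the walk moves by steps of 1 or 2.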

Recall that $m_u=1/(1+u)$ and $\sigma_u^2=u(1-u)/(1+u)^3$.

Theorem 3. (Initial populations through hitting times.) Fix a real number $u$ in $]0,1[$.
For every positive integer $x$,

$$|\mathbb{E}(\eta_x)-m_u x|\le 2u/(1+u)^2\le 1/2.$$

Furthermore, when $x$ converges to infinity, $(\eta_x-m_u x)/\sqrt{x}$ converges in distribution
to a centered Gaussian variable with variance $\sigma_u^2$.

The rest of the paper is organized as follows. We prove theorem 1 and proposition 1
in section 3 and theorem 2 in section 4. Finally, the proof of theorem 3, sharper bounds
on $\mathbb{E}(\eta_x)$ and a brief comparison with another, non Bayesian, estimation procedure are
in section 5.
3. Posterior distributions

3.1. Preliminaries 

Jeffreys' principle, see Kass and Wasserman (1996), indicates that the prior measure
for a parameter $\theta$ governing the distribution $\nu_\theta$ of a random variable $Z$ should have a
density proportional to $J(\theta)^{1/2}$, where

$$J(\theta)=-\mathbb{E}_\theta\left(\frac{\partial^2}{\partial\theta^2}\log\nu_\theta(Z)\right).$$

We apply this to the parameter $(\Lambda,U)$. Parts of lemma 1 are in Lalam and Jacob (2007).

Lemma 1. For every positive integer $n$, the prior measure for $(\Lambda,U)$ according to
Jeffreys' principle and based on $X_{0:n}$ is the product of the prior measures for $\Lambda$ and
$U$. The prior measures for $\Lambda$ and for $U$ are respectively proportional to the measures
$\mathrm{d}\lambda/\sqrt{\lambda}$ on $\lambda>0$ and $\pi_n(u)\,\mathrm{d}u$ on $0<u<1$, where

$$\pi_n(u)=\sqrt{\frac{(1+u)^n-1}{u^2(1-u)}}.$$

In particular, the prior of $U$ is proper.

Proof of lemma 1. Assume that $X_0$ is Poisson distributed with parameter $\Lambda$ and
that $X_{0:n}$ is a binary branching process with parameter $U$. Then the distribution $\nu_{\Lambda,U}$
of $X_{0:n}$ is such that

$$\nu_{\Lambda,U}(x_{0:n})=\mathrm{e}^{-\Lambda}\,\frac{\Lambda^{x_0}}{x_0!}\,\prod_{k=1}^{n}\binom{x_{k-1}}{x_k-x_{k-1}}\,U^{x_k-x_{k-1}}(1-U)^{2x_{k-1}-x_k}.$$

Up to a term $C(x_{0:n})$ which does not depend on $(\Lambda,U)$, $\log\nu_{\Lambda,U}(x_{0:n})$ is

$$-\Lambda+x_0\log\Lambda+(x_n-x_0)\log U+(s_n-2x_n+2x_0)\log(1-U)+C(x_{0:n}).$$

This is the sum of a function of $\Lambda$ and a function of $U$, hence the prior measures are
product measures. As regards the prior for $\Lambda$,

$$\frac{\partial^2}{\partial\Lambda^2}\log\nu_\Lambda(x_0)=-\frac{x_0}{\Lambda^2},\quad\text{hence}\quad J(\Lambda)=\frac{\mathbb{E}_\Lambda(X_0)}{\Lambda^2}=\frac{1}{\Lambda}.$$

As regards the prior for $U$,

$$\frac{\partial^2}{\partial U^2}\log\nu_U(x_{0:n})=-(x_n-x_0)/U^2-(s_n-2x_n+2x_0)/(1-U)^2,$$

hence

$$J_n(U)=\frac{\mathbb{E}_U(X_n-X_0)}{U^2}+\frac{\mathbb{E}_U(S_n-2X_n+2X_0)}{(1-U)^2},$$

where $S_n=X_1+\cdots+X_n$. Since $\mathbb{E}_U(X_k)=(1+U)^k\,\mathbb{E}(X_0)$ for every nonnegative
integer $k$, one finds that $J_n(U)=\mathbb{E}(X_0)\,\pi_n(U)^2$ with the notations of the lemma.
Finally, up to multiplicative constants, $\pi_n(u)$ behaves like $1/\sqrt{u}$ when $u$ converges
to $0$ and like $1/\sqrt{1-u}$ when $u$ converges to $1$. Hence, $\pi_n$ is integrable and there exists
a (proper) prior distribution for $U$. This concludes the proof of lemma 1.
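
The algebra leading to $J_n(U)=\mathbb{E}(X_0)\,\pi_n(U)^2$ can be double-checked with exact rational arithmetic. The sketch below uses only the relation $\mathbb{E}_U(X_k)=(1+U)^k\,\mathbb{E}(X_0)$ from the proof above; the function names and the test value $u=2/7$ are illustrative choices.

```python
from fractions import Fraction

def fisher_information_u(u, n, mean_x0=1):
    """J_n(u) computed from E(X_k) = (1 + u)^k E(X_0)."""
    means = [mean_x0 * (1 + u) ** k for k in range(n + 1)]  # E(X_0), ..., E(X_n)
    e_xn_minus_x0 = means[n] - means[0]
    e_sn = sum(means[1:])  # E(S_n), with S_n = X_1 + ... + X_n
    e_rest = e_sn - 2 * means[n] + 2 * means[0]
    return e_xn_minus_x0 / u ** 2 + e_rest / (1 - u) ** 2

def pi_n_squared(u, n):
    """pi_n(u)^2 = ((1 + u)^n - 1) / (u^2 (1 - u))."""
    return ((1 + u) ** n - 1) / (u ** 2 * (1 - u))

u = Fraction(2, 7)
for n in range(1, 8):
    print(n, fisher_information_u(u, n) == pi_n_squared(u, n))
```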

From now on, we fix a positive integer $n$, we assume that the observations are
$X_{1:n}=x_{1:n}$ with $x_{1:n}=(x_k)_{1\le k\le n}$ and we recall that $s_n=x_1+\cdots+x_n$. The
posterior distribution in lemma 2 is similar, but not equal, to a posterior distribution
computed in Lalam and Jacob (2007).

Lemma 2. The posterior distribution of $(X_0,U)$ conditionally on $X_{1:n}=x_{1:n}$ depends
only on $x_1$, $x_n$ and $s_n$, and is proportional to the measure

$$\pi_n(u)\,u^{x_n}(1-u)^{s_n-2x_n}\,\mathrm{d}u\ \sum_{x=h(x_1)}^{x_1}\binom{2x}{x}\binom{x}{x_1-x}\,g(u)^x\,\delta_x.$$

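Proof of lemma 2. Fix $u$, $x_{1:n}$ and $x$ such that $h(x_1)\le x\le x_1$. Then, the
conditional probability $\mathbb{P}(U\in\mathrm{d}u,\,X_0=x\mid X_{1:n}=x_{1:n})$ is proportional to

$$\nu_U(\mathrm{d}u)\int\nu_\Lambda(\mathrm{d}\lambda)\,\mathbb{P}_\lambda(X_0=x)\,\mathbb{P}_u(X_{1:n}=x_{1:n}\mid X_0=x),$$

where $\nu_U(\mathrm{d}u)=\pi_n(u)\,\mathrm{d}u$ and $\nu_\Lambda(\mathrm{d}\lambda)=\mathrm{d}\lambda/\sqrt{\lambda}$. Hence,

$$\int\nu_\Lambda(\mathrm{d}\lambda)\,\mathbb{P}_\lambda(X_0=x)=\int_0^{+\infty}\mathrm{e}^{-\lambda}\,\frac{\lambda^{x-1/2}}{x!}\,\mathrm{d}\lambda=\frac{\Gamma(x+1/2)}{x!}=\frac{\sqrt{\pi}}{4^x}\binom{2x}{x}.$$

Likewise, using the computations in the proof of lemma 1, one gets

$$\mathbb{P}_u(X_{1:n}=x_{1:n}\mid X_0=x)=C(x_{1:n})\binom{x}{x_1-x}\,u^{x_n-x}(1-u)^{s_n-2x_n+2x},$$

where $C(x_{1:n})$ does not depend on $(x,u)$. This concludes the proof of lemma 2.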

3.2. Proof of theorem 1

Part (1) follows from the fact that, when $n$ converges to infinity, $X_n/(1+U)^n$
converges almost surely to a random positive and finite limit.

A sketch of the proof of part (2) is as follows. Consider the distribution in lemma 2
and assume that $x_n$ converges to infinity and that $x_n/(s_n-x_n)$ converges to $v$. Then
$s_n-x_n$ is equivalent to $x_n/v$, hence

$$u^{x_n}(1-u)^{s_n-2x_n}=\left(u^{v}(1-u)^{1-v}\right)^{s_n-x_n+o(x_n)}.$$

The inner parenthesis is maximal when $u=v$, and the exponent converges to infinity,
hence this contribution becomes concentrated around the value $u=v$. The remaining
factor involving $u$ in the distribution described in lemma 2 is $g(u)^x$, and the convergence
to $\mu(g(v),x_1)$ follows.

For a detailed proof of part (2), we consider a sequence $x_{1:\infty}$ such that $x_n$ converges
to infinity and $x_n/(s_n-x_n)$ converges to $v$. For every positive integer $n$, we introduce
random variables $(T_n,U_n)$ distributed as $(X_0,U)$ conditionally on $X_{1:n}=x_{1:n}$. We
first show the convergence in probability of $U_n$, then the convergence in distribution
of $(T_n,U_n)$.

Lemma 3. With the notations above, $U_n$ converges to $v$ in probability.

Proof of lemma 3. Lemma 2 yields

$$\mathbb{P}(T_n=x,\,U_n\in\mathrm{d}u)=c_n\,p_x\,g(u)^x\,b_n(u)\,q_n(u)\,\mathrm{d}u,$$

where $c_n$ denotes a normalizing constant which is independent of $x$ and $u$, $p_x$ depends
only on $x$ and $x_1$, $b_n(u)$ depends only on $u$, $x_n$ and $s_n$, and $q_n(u)$ depends only on
$u$ and $n$. More precisely, for every integer $x$ such that $x_1\le 2x\le 2x_1$ and every real
number $u$ in $]0,1[$,

$$p_x=\binom{2x}{x}\binom{x}{x_1-x},\qquad b_n(u)=u^{x_n-1/2}(1-u)^{s_n-2x_n-1/2},\qquad q_n(u)=\sqrt{\frac{(1+u)^n-1}{u}}.$$

We aim to show that, for every integer $x$ such that $p_x$ is positive and every positive
real number $z$, when $n$ converges to infinity,

$$\mathbb{P}(T_n=x,\,|U_n-v|\ge z)\ll\mathbb{P}(T_n=x).$$
Since the function $q_n$ is nondecreasing,

$$\mathbb{P}(T_n=x,\,|U_n-v|\ge z)\le c_n\,p_x\,q_n(1)\int_{|u-v|\ge z}g(u)^x\,b_n(u)\,\mathbf{1}_{[0,1]}(u)\,\mathrm{d}u,$$

and

$$\mathbb{P}(T_n=x)\ge c_n\,p_x\,q_n(0)\int_0^1 g(u)^x\,b_n(u)\,\mathrm{d}u.$$

The ratio of the two integrals written above is $\mathbb{P}(|B_n-v|\ge z)$, where $B_n$ is a beta
random variable of parameters $(\alpha_n,\beta_n)$, with

$$\alpha_n=x_n-x+1/2,\qquad\beta_n=s_n-2x_n+2x+1/2.$$

Since $\alpha_n$ and $\beta_n$ both converge to infinity and $\alpha_n/(\alpha_n+\beta_n)$ converges to $v$, it is an easy
matter to show that $B_n$ converges in probability to $v$. However, we need a stronger
statement, namely the fact that $\mathbb{P}(|B_n-v|\ge z)\ll q_n(0)/q_n(1)$. Note that $q_n(0)=\sqrt{n}$
and $q_n(1)\sim 2^{n/2}$, hence $q_n(0)/q_n(1)\ll 1$.

One can write an elementary proof of this, based on the representation of beta
random variables with integer parameters as ratios of sums of i.i.d. exponential random
variables and on large deviations properties of these sums. Instead, we rely on
approximations of beta distributions by normal distributions provided by Alfers and
Dinges (1984). A rephrasing of corollary 1 on page 405 of this paper is as follows.
Let $(Y_k)_k$ denote a sequence of beta random variables of parameters $(ka_k,\,k(1-a_k))$.
Assume that $k$ converges to infinity and that $a_k$ converges to a limit $0<a<1$. Then,
for every fixed $y$ such that $a<y<1$, the ratio

$$\frac{\mathbb{P}(Y_k\ge y)}{\mathbb{P}\left(Z\ge\sqrt{2k\,\ell(a_k,y)}\right)}$$

converges to a finite and positive limit, which depends on $a$ and $y$ only, where $Z$ denotes
a standard Gaussian random variable, and $\ell$ denotes the function defined by

$$\ell(a,y)=a\log\left(\frac{a}{y}\right)+(1-a)\log\left(\frac{1-a}{1-y}\right).$$

Since $a_k$ converges to $a$ and $\ell(a,y)$ is a continuous function of $a$, standard estimates
of Gaussian tails and the result by Alfers and Dinges show that there exists a positive
constant $C<1$, independent of $k$, such that for every $k$ large enough,

$$\mathbb{P}(Y_k\ge y)\le C^k.$$

Applying this to our setting, first to the random variables $B_n$ and to $y=v+z$, then to
the random variables $1-B_n$ and to $y=1-v+z$, one gets the existence of a constant
$C<1$ such that, for every $n$ large enough,

$$\mathbb{P}(|B_n-v|\ge z)\le 2\,C^{\alpha_n+\beta_n}.$$

Since $\alpha_n+\beta_n=s_{n-1}+x+1\ge s_{n-1}\gg n$, $2\,C^{\alpha_n+\beta_n}\ll q_n(0)/q_n(1)$, and the proof of
lemma 3 is complete.

We now apply lemma 3 to the proof of part (2). Introduce the finite sums

$$p(u)=\sum_{x=h(x_1)}^{x_1}p_x\,g(u)^x.$$

For every $u$ in $]0,1[$, the distribution of $T_n$ conditionally on $U_n=u$ is independent of
$n$ and such that

$$\mathbb{P}(T_n=x\mid U_n=u)=p(u)^{-1}\,p_x\,g(u)^x.$$

Hence, for every measurable subset $B$ of $]0,1[$,

$$\mathbb{P}(T_n=x,\,U_n\in B)=\mathbb{E}\left(p(U_n)^{-1}\,p_x\,\mathbf{1}_B(U_n)\,g(U_n)^x\right).$$

The function $u\mapsto p(u)^{-1}\,p_x\,\mathbf{1}_B(u)\,g(u)^x$ is bounded by 1 on $]0,1[$ and, as soon as $v$ is
not in the boundary of $B$, continuous at $u=v$. Since $U_n$ converges in distribution to
$v$, this implies that $\mathbb{P}(T_n=x,\,U_n\in B)$ converges to $p(v)^{-1}\,p_x\,\mathbf{1}_B(v)\,g(v)^x$, for instance
for every interval $B=[0,u]$ with $u\ne v$. This is equivalent to the desired convergence
in distribution.

3.3. Remarks

For every positive integer $n$ and every admissible sample, $s_n\ge 2x_n(1-1/2^n)$ since
$x_k\ge x_{k+1}/2$ for every nonnegative integer $k$, hence $s_n-x_n\ge x_n+o(x_n)$ and $u\le 1$
in the asymptotics that we consider. Furthermore, the function $g$ decreases from
$g(0^+)=+\infty$ to $g(1^-)=0$.

The measures $\mu(r,x)$ for the first values of $x$ are as follows: $\mu(r,1)=\delta_1$,

$$\mu(r,2)=\frac{\delta_1+3r\,\delta_2}{1+3r},\qquad\mu(r,3)=\frac{3\,\delta_2+5r\,\delta_3}{3+5r},\qquad\mu(r,4)=\frac{3\,\delta_2+30r\,\delta_3+35r^2\,\delta_4}{3+30r+35r^2},$$

and

$$\mu(r,5)=\frac{15\,\delta_3+70r\,\delta_4+63r^2\,\delta_5}{15+70r+63r^2}.$$
3.4. Proof of proposition 1

The monotonicity with respect to $r$ is valid in a wider setting, described in
proposition 2 below, but the monotonicity with respect to $x$ is more specific.

Proposition 2. Let $\mu$ denote a nonzero bounded measure with exponential moments.
For every real number $a$, introduce the measures $\nu_a$ and $\mu_a$ defined by the relations
$\nu_a(\mathrm{d}x)=\mathrm{e}^{ax}\mu(\mathrm{d}x)$ and $\mu_a=\nu_a/|\nu_a|$. Then the family $(\mu_a)_a$ is stochastically
nondecreasing.

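Proof of proposition 2. Fix $x$. The derivative of $\mu_a([x,+\infty))$ with respect to $a$ has
the sign of $D(x)$, with

$$D(x)=\int_{z\ge x}\mathrm{e}^{az}\,\mu(\mathrm{d}z)\int(z-y)\,\mathrm{e}^{ay}\,\mu(\mathrm{d}y).$$

The variations of $D(x)$ with respect to $x$ are given by

$$\mathrm{d}D(x)=\mathrm{e}^{ax}\,\mu(\mathrm{d}x)\int(y-x)\,\mathrm{e}^{ay}\,\mu(\mathrm{d}y).$$

The integral in the right hand side is a nonincreasing function of $x$. Since $D(-\infty)=
D(+\infty)=0$, the function $x\mapsto D(x)$ is nondecreasing for $x\le x_a$ and nonincreasing for
$x\ge x_a$, where $x_a$ solves the equation

$$\int(y-x_a)\,\mathrm{e}^{ay}\,\mu(\mathrm{d}y)=0.$$

This proves that $D(x)\ge 0$ for every $x$, hence $\mu_a([x,+\infty))\le\mu_b([x,+\infty))$ for every
$a\le b$. This concludes the proof of proposition 2.

We turn to the monotonicity of $\mu(r,x)$ with respect to $x$. We fix a value of $r$ and
write every $\nu(r,x)$ as

$$\nu(r,x)=\sum_y a^x_y\,r^y\,\delta_y,$$

with nonnegative coefficients $a^x_y$.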





We want to prove that for every $x$, $G(z)\ge 0$ for every $z$, with

$$G(z)=\sum_y a^x_y\,r^y\sum_{y\ge z}a^{x+1}_y\,r^y-\sum_{y\ge z}a^x_y\,r^y\sum_y a^{x+1}_y\,r^y.$$



One sees that G(0) = G(oo) = 0, and simple computations show that 
At this point, we use the specific form of the coefficients $a^x_y$, which yields

$$\frac{a^{x+1}_z}{a^x_z}=2\,\frac{(2x+1)(2z-x)}{(x+1)(x+1-z)}.$$

This shows that $(F(z))_z$ is a nonincreasing sequence, hence $G(z+1)-G(z)\ge 0$ if
$z\le z_*$ and $G(z+1)-G(z)\le 0$ if $z\ge z_*$, for a given $z_*$. Hence the sequence $(G(z))_z$
is nondecreasing on $z\le z_*$ and nonincreasing on $z\ge z_*$. Since $G(0)=G(\infty)=0$, this
implies that $G(z)\ge 0$ for every positive $z$. This concludes the proof of proposition 1.

4. Limit posterior distributions of initial populations 
4.1. Expectations 

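Let $u$ in $]0,1[$ and $r=g(u)$. We are interested in the limit as $x\to\infty$ of the sequence

$$\frac{\mathbb{E}(\xi_x)}{x}=\frac{A(r,x)}{x\,B(r,x)},$$

with the notations

$$A(r,x)=\sum_y y\binom{2y}{y}\binom{y}{x-y}\,r^y\quad\text{and}\quad B(r,x)=\sum_y\binom{2y}{y}\binom{y}{x-y}\,r^y=|\nu(r,x)|.$$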
Definition 3. For every positive $\lambda$ and $r$, introduce

$$C_\lambda(r,z)=(1-4rz(1+z))^{-\lambda}=\sum_{x\ge 0}c_\lambda(r,x)\,z^x.$$

Starting from the expansion

$$(1-4z)^{-1/2}=\sum_{x\ge 0}\binom{2x}{x}z^x,$$

one can write $B(r,x)$ as the coefficient of $z^x$ in the expansion of $C_{1/2}(r,z)$ along the
powers of $z$, namely, $B(r,x)=c_{1/2}(r,x)$. Likewise, $A(r,x)$ is $r$ times the derivative
of $B(r,x)$ with respect to $r$, hence $A(r,x)$ is the coefficient of $z^x$ in the expansion of
$2rz(1+z)\,C_{3/2}(r,z)$ along the powers of $z$. This yields

$$A(r,x)=2r\left(c_{3/2}(r,x-1)+c_{3/2}(r,x-2)\right),$$

$$x\,B(r,x)=2r\left(c_{3/2}(r,x-1)+2\,c_{3/2}(r,x-2)\right).$$
Definition 4. For every positive $r$, introduce

$$\gamma(r)=\frac12\left(\sqrt{\frac{1+r}{r}}-1\right),\qquad m(r)=\frac{1+\gamma(r)}{1+2\gamma(r)}=\frac12\left(1+\sqrt{\frac{r}{1+r}}\right).$$

Note that, for every $u$ in $]0,1[$,

$$\gamma(g(u))=\frac{u}{1-u},\qquad m(g(u))=\frac{1}{1+u}=m_u.$$
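
These two identities are easy to confirm numerically. The sketch below assumes $g(u)=(1-u)^2/(4u)$, $\gamma(r)=\frac12\left(\sqrt{(1+r)/r}-1\right)$ and $m(r)=(1+\gamma(r))/(1+2\gamma(r))$, and also checks that $\gamma(r)$ is a root of $1-4rz(1+z)$; the function name `gamma_root` is a choice made for this example.

```python
from math import sqrt, isclose

def g(u):
    return (1 - u) ** 2 / (4 * u)

def gamma_root(r):
    """Positive root of 1 - 4 r z (1 + z) = 0, written gamma(r) in the text."""
    return 0.5 * (sqrt((1 + r) / r) - 1)

def m(r):
    c = gamma_root(r)
    return (1 + c) / (1 + 2 * c)

for u in (0.1, 0.5, 0.9):
    r = g(u)
    print(u, gamma_root(r), u / (1 - u), m(r), 1 / (1 + u))
```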

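Lemma 4. For every positive $\lambda$ and $r$, when $x$ converges to infinity,

$$c_\lambda(r,x)\sim c_\lambda(r)\,x^{\lambda-1}\,\gamma(r)^{-x},\qquad c_\lambda(r)=m(r)^{\lambda}/\Gamma(\lambda).$$

Proof of lemma 4. This is a consequence of known expansions of powers of $1/(1-z)$.
First, recall that

$$(1-z)^{-\lambda}=\sum_{x\ge 0}d_\lambda(x)\,z^x,\qquad d_\lambda(x)=\frac{\Gamma(x+\lambda)}{\Gamma(\lambda)\,x!}\sim\frac{x^{\lambda-1}}{\Gamma(\lambda)}.$$

We use this and the decomposition

$$1-4rz(1+z)=\left(1-\frac{z}{\gamma(r)}\right)\left(1+\frac{z}{1+\gamma(r)}\right),$$

to get the expansion

$$C_\lambda(r,z)=\sum_{x\ge 0}d_\lambda(x)\left(\frac{z}{\gamma(r)}\right)^x\ \sum_{y\ge 0}d_\lambda(y)\left(\frac{-z}{1+\gamma(r)}\right)^y,$$

which implies

$$c_\lambda(r,x)=d_\lambda(x)\,\gamma(r)^{-x}\sum_{y=0}^{x}\left(\frac{-\gamma(r)}{\gamma(r)+1}\right)^y d_\lambda(y)\,\frac{d_\lambda(x-y)}{d_\lambda(x)}.$$

When $x$ converges to infinity, the ratios $d_\lambda(x-y)/d_\lambda(x)$ converge to 1, hence, by
dominated convergence,

$$c_\lambda(r,x)\sim d_\lambda(x)\,\gamma(r)^{-x}\sum_{y\ge 0}\left(\frac{-\gamma(r)}{\gamma(r)+1}\right)^y d_\lambda(y)=d_\lambda(x)\,\gamma(r)^{-x}\left(1+\frac{\gamma(r)}{1+\gamma(r)}\right)^{-\lambda},$$

where the equality stems from the definition of the coefficients $d_\lambda(\cdot)$. Plugging the
equivalent of $d_\lambda(x)$ into this and using the fact that $1+\gamma(r)/(1+\gamma(r))=1/m(r)$, one
deduces lemma 4.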

Lemma 4 for $\lambda=3/2$ yields that, when $x$ converges to infinity, there exists a constant
$a$, whose value is irrelevant, such that

$$A(r,x)\sim 2ra\,x^{1/2}\,\gamma(r)^{-x}\,\gamma(r)\,(1+\gamma(r)),$$

and

$$x\,B(r,x)-A(r,x)\sim 2ra\,x^{1/2}\,\gamma(r)^{-x}\,\gamma(r)^2.$$

Hence $(x\,B(r,x)-A(r,x))/A(r,x)$ converges to $\gamma(r)/(1+\gamma(r))$, and

$$\frac{A(r,x)}{x\,B(r,x)}\quad\text{converges to}\quad\frac{1+\gamma(r)}{1+2\gamma(r)}=m(r).$$

This is the desired convergence of the expectations because, as mentioned above, the
relation $r=g(u)$ means that $m(r)=m_u$.

4.2. Modes 

To study the mode of $\xi_x$, one compares $\nu(r,x)(y+1)$ to $\nu(r,x)(y)$. The ratios

$$\frac{\nu(r,x)(y+1)}{\nu(r,x)(y)}=\frac{(y+1/2)\,(x-y)\,r}{(y+1-x/2)\,(y+1/2-x/2)}$$

are the terms of a nonincreasing sequence indexed by $y$. Writing $y$ as $y=x\,(1+s)/(2s)$
with $s\ge 1$, when $x$ is large, one gets

$$\frac{\nu(r,x)(y+1)}{\nu(r,x)(y)}\to r\,(s^2-1).$$

This implies that the sequence $(\nu(r,x)(y))_y$ is increasing on $y\le y_*$ and decreasing on
$y\ge y_*$, for a value of $y_*$ such that $y_*=x\,(1+s_*)/(2s_*)+o(x)$ with $s_*^2=1+1/r$.
Finally, this shows that, when $r=g(u)$, the mode of $\mu(r,x)$ is at $x/(1+u)+o(x)$.

4.3. Distributions 

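Our next computation is based on characteristic functions. Fix $u$ in $]0,1[$ and let
$r=g(u)$. For every positive integer $x$, introduce

$$F_x(t)=\mathbb{E}\left[\exp\left(t\,\frac{\xi_x-x\,m(r)}{\sqrt{x}}\right)\right].$$

Recall that

$$m(r)=\frac{1+\gamma(r)}{1+2\gamma(r)},\qquad\frac{1}{1+2\gamma(r)}=\sqrt{\frac{r}{1+r}},\qquad r=\frac{(1-u)^2}{4u}.$$

Since $\mathbb{E}(\exp(t\,\xi_x))=B(r\mathrm{e}^{t},x)/B(r,x)$,

$$F_x(t)=\mathrm{e}^{-t\sqrt{x}\,m(r)}\,B(r\mathrm{e}^{t/\sqrt{x}},x)/B(r,x).$$

We turn to the study of the sequence of functions $(B(\cdot,x))_{x\ge 1}$.

Since $B(r,x)=c_{1/2}(r,x)$, a consequence of lemma 4 is that, when $x$ converges to
infinity,

$$\frac{B(r\mathrm{e}^{t/\sqrt{x}},x)}{B(r,x)}\sim\left(\frac{\gamma(r)}{\gamma(r\mathrm{e}^{t/\sqrt{x}})}\right)^{x}\,\frac{S(x,t/\sqrt{x})}{m(r)^{1/2}},$$

where, for every $s$,

$$S(x,s)=\sum_{y=0}^{x}\left(\frac{-\gamma(r\mathrm{e}^{s})}{\gamma(r\mathrm{e}^{s})+1}\right)^y d_{1/2}(y)\,\frac{d_{1/2}(x-y)}{d_{1/2}(x)}.$$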

We get rid of the fraction involving $S(x,t/\sqrt{x})$ through lemmas 5 and 6.

Lemma 5. For every nonnegative $x$ and $y$, $d_{1/2}(y)\,d_{1/2}(x)\le d_{1/2}(x+y)$.

Proof of lemma 5. A probabilistic proof is as follows. For every nonnegative $x$,
$d_{1/2}(x)=2^{-2x}\binom{2x}{x}$ is the probability that a simple symmetric random walk on the
integer line is at its starting point after $2x$ steps. Hence $d_{1/2}(x+y)$ is the probability
that the random walk is at its starting point after $2x+2y$ steps and $d_{1/2}(y)\,d_{1/2}(x)$
is the probability that the random walk is at its starting point after $2x$ steps and also
after $2x+2y$ steps. The latter event being included in the former, this shows the
desired inequality.

Lemma 6. When $x$ converges to infinity, $S(x,t/\sqrt{x})$ converges to $m(r)^{1/2}$.

Proof of lemma 6. Since $S(x,0)$ converges to $S(\infty,0)=m(r)^{1/2}$ when $x$ converges
to infinity, we show that $S(x,t/\sqrt{x})-S(x,0)$ converges to 0. By lemma 5, the ratios
of coefficients $d_{1/2}$ involved in $S(x,t/\sqrt{x})$ and $S(x,0)$ are bounded by 1. Adding terms
such that $y\ge x+1$, one gets $|S(x,t/\sqrt{x})-S(x,0)|\le T(t/\sqrt{x})$, where

$$T(t/\sqrt{x})=\sum_{y=0}^{+\infty}\left|\left(\frac{\gamma(r\mathrm{e}^{t/\sqrt{x}})}{\gamma(r\mathrm{e}^{t/\sqrt{x}})+1}\right)^y-\left(\frac{\gamma(r)}{\gamma(r)+1}\right)^y\right|.$$

All the terms in the sum have the same sign, hence

$$T(t/\sqrt{x})=\left|\sum_{y=0}^{+\infty}\left(\frac{\gamma(r\mathrm{e}^{t/\sqrt{x}})}{\gamma(r\mathrm{e}^{t/\sqrt{x}})+1}\right)^y-\sum_{y=0}^{+\infty}\left(\frac{\gamma(r)}{\gamma(r)+1}\right)^y\right|.$$

One can compute the sum of each geometric series. This yields

$$|S(x,t/\sqrt{x})-S(x,0)|\le T(t/\sqrt{x})=|\gamma(r\mathrm{e}^{t/\sqrt{x}})-\gamma(r)|,$$

which proves the lemma since $\gamma(\cdot)$ is a continuous function.
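Lemma 6 shows that

$$\frac{B(r\mathrm{e}^{t/\sqrt{x}},x)}{B(r,x)}\sim\left(\frac{\gamma(r\mathrm{e}^{t/\sqrt{x}})}{\gamma(r)}\right)^{-x}.$$

The rest of the proof is standard. A Taylor expansion of $\gamma(\cdot)$ around $r$ yields

$$\gamma(r\mathrm{e}^{t/\sqrt{x}})=\gamma(r)+(\mathrm{e}^{t/\sqrt{x}}-1)\,r\gamma'(r)+(\mathrm{e}^{t/\sqrt{x}}-1)^2\,r^2\gamma''(r)/2+o\left((\mathrm{e}^{t/\sqrt{x}}-1)^2\right).$$

Using the expansion of $\mathrm{e}^{t/\sqrt{x}}$ along powers of $1/\sqrt{x}$ and dividing everything by $\gamma(r)$,
one gets

$$\frac{\gamma(r\mathrm{e}^{t/\sqrt{x}})}{\gamma(r)}=1+\left(\frac{r\gamma'(r)}{\gamma(r)}\right)\frac{t}{\sqrt{x}}+\left(\frac{r\gamma'(r)}{\gamma(r)}+\frac{r^2\gamma''(r)}{\gamma(r)}\right)\frac{t^2}{2x}+o\left(\frac1x\right).$$

Note that

$$\frac{r\gamma'(r)}{\gamma(r)}=-m(r).$$

Taking logarithms, writing the ratio of functions $\gamma$ as

$$\left(\gamma(r\mathrm{e}^{t/\sqrt{x}})/\gamma(r)\right)^{-x}=\exp\left(-x\log\left(\gamma(r\mathrm{e}^{t/\sqrt{x}})/\gamma(r)\right)\right),$$

and using the expansion $\log(1+z)=z-z^2/2+o(z^2)$ when $z=o(1)$, one gets that
$F_x(t)$ is equivalent to the exponential of

$$-t\,m(r)\sqrt{x}-x\left(-m(r)\,t/\sqrt{x}+(m_2(r)-m(r))\,t^2/(2x)-m(r)^2\,t^2/(2x)+o(1/x)\right),$$

where

$$m_2(r)=\frac{r^2\gamma''(r)}{\gamma(r)}.$$

Finally, $F_x(t)$ converges to $\mathrm{e}^{\sigma^2(r)\,t^2/2}$, with

$$\sigma^2(r)=m(r)^2+m(r)-m_2(r).$$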
Using the definitions of $m(r)$ and $m_2(r)$ as functions of $\gamma(r)$ and its derivatives, one
gets

$$\sigma^2(r)=-r\left(\frac{r\gamma'(r)}{\gamma(r)}\right)'=r\,m'(r).$$

Using the formula for $m(r)$ at the beginning of this section, one gets finally

$$\sigma^2(r)=\frac14\sqrt{\frac{r}{(1+r)^3}}=\frac{u(1-u)}{(1+u)^3}=\sigma_u^2.$$

The proof is complete.



5. Conditional hitting times 

5.1. Proof of theorem [3] 

We introduce the renewal process $(\zeta_x)_{x\ge 1}$ with increments $(\varepsilon_x)_{x\ge 1}$, that is

$$\zeta_x=\inf\{y\ge 1\,;\ \sigma_y\ge x\}.$$

The usual central limit theorem for renewal processes states that $(\zeta_x-mx)/\sqrt{x}$
converges in distribution to a centered Gaussian variable whose variance is the variance
$u(1-u)$ of every $\varepsilon_x$ divided by the cube of the mean $1+u$ of every $\varepsilon_x$, that is
$u(1-u)/(1+u)^3=\sigma_u^2$. Here and below, $m=m_u$.

Our next lemma expresses the distribution of $\eta_x$ for every positive $x$ in terms of the
distributions of the random variables $(\zeta_z)_{1\le z\le x+1}$.

Lemma 7. For every positive $x$ and $y$,

$$\mathbb{P}(\eta_x=y)=\frac{1+u}{1-(-u)^{x+1}}\sum_{z=0}^{x-1}(-u)^z\,\mathbb{P}(\zeta_{x+1-z}=y+1).$$

Proof of lemma 7. Let $x$ and $y$ denote positive integers. We begin with the fact
that

$$\{\zeta_{x+1}=y+1\}=\{\sigma_y=x\}\cup\{\sigma_y=x-1,\ \varepsilon_{y+1}=2\},$$

hence

$$\mathbb{P}(\sigma_y=x)=\mathbb{P}(\zeta_{x+1}=y+1)-u\,\mathbb{P}(\sigma_y=x-1).$$

Iterating this recursion, one gets

$$\mathbb{P}(H_x)\,\mathbb{P}(\eta_x=y)=\mathbb{P}(\sigma_y=x)=\sum_{z=0}^{x-1}(-u)^z\,\mathbb{P}(\zeta_{x+1-z}=y+1).$$

Summing over every positive value of $y$ and using the facts that $\mathbb{P}(\zeta_z=1)=0$ if $z\ge 3$
and that $\mathbb{P}(\zeta_2=1)=u$, one gets

$$\mathbb{P}(H_x)=(-u)^{x-1}(1-u)+\sum_{z=0}^{x-2}(-u)^z=\frac{1-(-u)^{x+1}}{1+u}.$$

This concludes the proof.

Lemma 7, the fact that $|u|<1$ and the convergence of the distribution of
$(\zeta_x-mx)/\sqrt{x}$ imply the same convergence for the distribution of $(\eta_x-mx)/\sqrt{x}$.

Finally, $(\xi_x-mx)/\sqrt{x}$, $(\zeta_x-mx)/\sqrt{x}$ and $(\eta_x-mx)/\sqrt{x}$ all converge in distribution
to the same limit, which is the centered Gaussian distribution with variance $\sigma_u^2$.

5.2. Sharp bounds 

Lemma 8. For every positive $x$,

$$\mathbb{E}(\eta_x)=\frac{x+1}{1+u}\cdot\frac{1+(-u)^{x+2}}{1-(-u)^{x+1}}-\frac{1+u^2}{(1+u)^2}.$$

For instance,

$$\mathbb{E}(\eta_1)=1,\qquad\mathbb{E}(\eta_2)=2-\frac{u}{1-u(1-u)}.$$

For every positive integer $x$, one can deduce from the exact formula above that

$$\frac{x}{1+u}-\frac{2u^2}{(1+u)^2}\le\mathbb{E}(\eta_x)\le\frac{x}{1+u}+\frac{2u}{(1+u)^2}.$$

The width of the interval delimited by the upper and the lower bounds above
is $2u/(1+u)\le 1$.

Bounds on $\mathbb{E}(\eta_x)$, depending on the parity of $x$, are as follows. For every odd $x$,

$$\mathbb{E}(\eta_x)\ge x/(1+u),$$

and for every even $x$,

$$\mathbb{E}(\eta_x)\le x/(1+u)+u(1-u)/(1+u)\le(x+1/4)/(1+u).$$

These refined bounds yield intervals around $\mathbb{E}(\eta_x)$, which depend on the parity of $x$,
and whose width is always at most $2u/(1+u)^2\le 1/2$.
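
The closed form of lemma 8 and the first pair of bounds can be compared with a direct computation of $\mathbb{E}(\eta_x)$ from the weights $\mathbb{P}(\sigma_y=x)$. The sketch below does this with exact rational arithmetic; it reads the lemma as $\mathbb{E}(\eta_x)=\frac{x+1}{1+u}\cdot\frac{1+(-u)^{x+2}}{1-(-u)^{x+1}}-\frac{1+u^2}{(1+u)^2}$, and the function names and the value $u=1/2$ are illustrative choices.

```python
from fractions import Fraction
from math import comb

def mean_eta(u, x):
    """E(eta_x) from the weights P(sigma_y = x), h(x) <= y <= x."""
    weights = {y: comb(y, x - y) * u ** (x - y) * (1 - u) ** (2 * y - x)
               for y in range((x + 1) // 2, x + 1)}
    return sum(y * w for y, w in weights.items()) / sum(weights.values())

def mean_eta_closed_form(u, x):
    """Closed form of E(eta_x), as read from lemma 8."""
    ratio = (1 + (-u) ** (x + 2)) / (1 - (-u) ** (x + 1))
    return Fraction(x + 1) / (1 + u) * ratio - (1 + u ** 2) / (1 + u) ** 2

u = Fraction(1, 2)
for x in range(1, 6):
    lower = x / (1 + u) - 2 * u ** 2 / (1 + u) ** 2
    upper = x / (1 + u) + 2 * u / (1 + u) ** 2
    print(x, mean_eta(u, x), mean_eta_closed_form(u, x), lower, upper)
```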
Proof of lemma 8. Fix a positive integer $x$, a real number $u$ in $]0,1[$, and let $r=g(u)$.
Let $p^x_y=\mathbb{P}(\sigma_y=x)$. Then the distribution of $\eta_x$ is proportional to the measure
$\sum_y p^x_y\,\delta_y$ and, for every positive $y$,

$$p^x_y=\binom{y}{x-y}\,u^{x-y}(1-u)^{2y-x}=\left(\frac{u}{1-u}\right)^{x}\binom{y}{x-y}\,(4r)^y.$$

Hence the distribution of $\eta_x$ is $\mu_\eta(r,x)$, where $\mu_\eta(r,x)=\nu_\eta(r,x)/|\nu_\eta(r,x)|$ and

$$\nu_\eta(r,x)=\sum_{y=h(x)}^{x}\binom{y}{x-y}\,(4r)^y\,\delta_y.$$

When $u=0$, $r=\infty$ and $\mu_\eta(\infty,x)$ is the Dirac distribution at $x$. When $u=1$, $r=0$
and $\mu_\eta(0,x)$ is the Dirac distribution at $h(x)$. For the first values of $x$, the distributions
$\mu_\eta(r,x)$ are as follows: $\mu_\eta(r,1)=\delta_1$,

$$\mu_\eta(r,2)=\frac{\delta_1+4r\,\delta_2}{1+4r},\qquad\mu_\eta(r,3)=\frac{2\,\delta_2+4r\,\delta_3}{2+4r},\qquad\mu_\eta(r,4)=\frac{\delta_2+12r\,\delta_3+16r^2\,\delta_4}{1+12r+16r^2}.$$

This implies that, for every positive $x$,

$$\mathbb{E}(\eta_x)=r\,g_x'(r)/g_x(r),\qquad g_x(r)=\sum_y\binom{y}{x-y}\,4^y\,r^y.$$

To study the generating functions $g_x$, we introduce $g_0(r)=1$ and

$$G(r,z)=\sum_{x\ge 0}g_x(r)\,z^x.$$

Summing first over $y\le x\le 2y$, then over $y\ge 0$, one gets

$$G(r,z)=\sum_{y\ge 0}(4rz)^y\,(1+z)^y=1/(1-4rz(1+z))=C_1(r,z).$$

From the proof of lemma 4, one knows that the poles of $C_1(r,z)$ are $z=\gamma(r)$ and
$z=-\gamma_2(r)$ with $\gamma_2(r)=\gamma(r)+1$, hence,

$$G(r,z)=\frac{1}{\gamma(r)+\gamma_2(r)}\left(\frac{\gamma_2(r)}{1-z/\gamma(r)}+\frac{\gamma(r)}{1+z/\gamma_2(r)}\right).$$

This shows that, for every nonnegative $x$,

$$g_x(r)=\frac{\gamma(r)\,\gamma_2(r)}{\gamma(r)+\gamma_2(r)}\left(\gamma(r)^{-(x+1)}+(-1)^x\,\gamma_2(r)^{-(x+1)}\right).$$

From here, the expression of $\gamma(r)$ as a function of $r$ and tedious computations of
derivatives yield the result.
5.3. Comparison with a naive estimator

For a given value $u$ in $]0,1[$ and for a branching process $X_{0:\infty}$ with offspring
distribution $(1-u)\,\delta_1+u\,\delta_2$, when $n$ converges to infinity,

$$S_n\sim X_n\left(1+1/(1+u)+1/(1+u)^2+\cdots\right)=X_n\,(1+1/u)\quad\text{almost surely},$$

hence $B(X_{0:\infty})=u$ almost surely. The naive pointwise prediction of the mean initial
population conditional on $X_1=x$, namely $N_u(x)=x/(1+u)$, should be compared to
the Bayesian prediction $\mathbb{E}_u(\xi_x)$ for $r=g(u)$. For $x=2$, one gets

$$\frac{\mathbb{E}_u(\xi_2)}{N_u(2)}=\frac{\left(4u+6(1-u)^2\right)(1+u)}{2\left(4u+3(1-u)^2\right)}.$$

This ratio is 1 when $u=0$ or $u=1$, greater than 1 for every $u$ in $]0,\frac13[$, and smaller
than 1 for every $u$ in $]\frac13,1[$. Hence the naive and Bayesian predictions cannot be easily
compared, at least on $X_1=x$ for a given finite $x$.
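
The comparison for $x=2$ can be reproduced with exact rational arithmetic. The sketch below assumes that $\xi_2$ has distribution $\mu(g(u),2)=(\delta_1+3r\,\delta_2)/(1+3r)$ with $r=(1-u)^2/(4u)$; the function names are illustrative choices.

```python
from fractions import Fraction

def bayes_mean_x2(u):
    """Mean of mu(g(u), 2) = (delta_1 + 3 r delta_2) / (1 + 3 r), r = g(u)."""
    r = (1 - u) ** 2 / (4 * u)
    return (1 + 6 * r) / (1 + 3 * r)

def naive_mean(u, x):
    """Naive prediction N_u(x) = x / (1 + u)."""
    return x / (1 + u)

def ratio(u):
    return bayes_mean_x2(u) / naive_mean(u, 2)

for u in (Fraction(1, 4), Fraction(1, 3), Fraction(1, 2)):
    print(u, ratio(u))
```

The ratio minus 1 has the sign of $(1-u)(1-3u)$ on $]0,1[$, which is where the threshold $u=1/3$ comes from.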